HEAD ======= >>>>>>> dc3384fd1d9390421f5c883dabd29d4188c3dc98
library(DT)
library(ggplot2)
library(xfun)
library(SignalingProfiler)
library(tidyverse)
SignalingProfiler 2.0 is a modelling pipeline described in Venafra et al., 2024.
The SignalingProfiler package provides a flexible way to create signaling networks integrating transcriptomic, proteomic and phosphoproteomic data.
Briefly, the method allows the user to build a mechanistic model representing the signal remodeling upon a treatment or in a disease condition. The model starts with perturbed receptor(s) and ends with proteins whose activity is modulated (up- or down-regulated) among your two conditions (e.g. treated vs untreated, disease vs healthy).
The SignalingProfiler pipeline is modular. You can exploit the method to just perform one of the proposed tasks.
The pipeline is divided in three main steps:
Step 1: infer the activity of key signaling proteins from multi-omics data;
Step 2: connect proteins in a context-specific network;
Step 3: connect phenotypes to the final model to make it functionally interpretable;
At the end of the pipeline, you can extract functional circuits connecting perturbed nodes to crucial phenotypes for your context.
devtools::install_github('https://github.com/SaccoPerfettoLab/SignalingProfiler/')
devtools::install_github('https://github.com/SaccoPerfettoLab/SignalingProfiler/')
The most important prerequisites for SignalingProfiler is the ILP solver required by CARNIVAL algorithm to optimize your model against experimental data. An exhaustive guide about installation can be found in Prerequisites section of CARNIVAL GitHub page here.
If you want to create networks with SignalingProfiler install the ILP solver
To run the last step of SignalingProfiler is required to properly configure the reticulate environment. To do so, run the following command the first time you import SignalingProfiler.
install_sp_py()
SignalingProfiler requires processed tables derived from your multi-omics data. Here, we describe the information needed by the package.
Your transcriptomics and proteomics tables should contain the following columns:
gene_name and Uniprot ID for proteomics; SignalingProfiler is structured having the gene_name as key;
difference: representing the fold-change in gene expression among two conditions; e.g. the difference between treated and control sample;
The next columns regards parameters linked to the statistical test performed among your conditions:
logpval: the -Log(p-value) associated to your fold-change;
significant: a column containing a ‘+’ if the gene is significantly modulated, NA otherwise;
Transcriptomics data example
data("tr_toy_df")
DT::datatable(tr_toy_df, options = list(pageLength = 3))
=======
data("tr_toy_df")
DT::datatable(tr_toy_df, options = list(pageLength = 3))
>>>>>>> dc3384fd1d9390421f5c883dabd29d4188c3dc98
Proteomics data example
data("prot_toy_df")
DT::datatable(prot_toy_df, options = list(pageLength = 3))
<<<<<<< HEAD
=======
>>>>>>> dc3384fd1d9390421f5c883dabd29d4188c3dc98
Before using Signaling Profiler, you have to accurately process you phosphoproteomic data set. The most important step is to have ONLY ONE OCCURENCE OF EACH PHOSPHOSITE. If you have a phosphosite quantified more than once, you have to choose the most reliable quantification (usually the lowest multiplicity).
Your phosphoproteomics table should contain:
UniProt ID and gene_name: reporting the UNIPROT ID and the gene name respectively; again, choose just one gene name.
aminoacid and position: reporting the phosphosite with the single letter notation and the position in the protein sequence;
sequence window: the phosphopeptide centered on the modified residue; it should be at least a 15-mer, compatible with SIGNOR and PsP database notation;
difference: representing the fold-change in gene expression among two conditions; e.g. the difference between treated and control sample;
logpval: the -Log(p-value) associated to the statistical test assessing the significance of your comparisons;
significant: a column containing a ‘+’ if the gene is significantly modulated, NA otherwise;
Phosphoproteomics data example
data("phospho_toy_df")
DT::datatable(phospho_toy_df, options = list(pageLength = 3))
<<<<<<< HEAD
=======
>>>>>>> dc3384fd1d9390421f5c883dabd29d4188c3dc98
The output of SignalingProfiler is a list composed of different elements toy_sp_output:
Network in igraph format toy_sp_output$igraph_network
Table of the proteins (nodes) included in the network with attributes toy_sp_output$nodes_df
Protein identifiers: UniProt ID and gene name;
Two different activity scores:
Molecular function annotation (mf);
data("toy_sp_output")
DT::datatable(toy_sp_output$nodes_df, options = list(pageLength = 3))
<<<<<<< HEAD
=======
>>>>>>> dc3384fd1d9390421f5c883dabd29d4188c3dc98
Table of edges included in the network with biologically relevant attributes toy_sp_output$edges_df
Uniprot IDs of the nodes connected (source and target);
sign reporting if the interaction is activatory (1) or inhibitory (-1);
carnival_weight: since CARNIVAL may return more than one solustion, this attribute reports the frequency of the edge in the different solutions returned (e.g. 100 means that the edge is present in all the solutions!);
annotation from phosphoproteomics data:
aminoacid: reports the phosphosites in SIGNOR and PsP matching that interaction; if the phosphosite has a * it means that it is significantly modulated among your two conditions;
FC: reports the fold-change of phosphorylation among your two conditions of the phosphosite matched on the interaction;
is quantified is TRUE if you quantified the phosphosite matching that edge in your phosphoproteomic data set;
is significant is TRUE if the quantified phosphosite matching that edge in your phosphoproteomic data set is significantly modulated;
data("toy_sp_output")
DT::datatable(toy_sp_output$edges_df, options = list(pageLength = 3))
<<<<<<< HEAD
=======
>>>>>>> dc3384fd1d9390421f5c883dabd29d4188c3dc98
SignalingProfiler allows the user to infer protein activities from omic data comining:
Footprint-based analysis: infer the activity of a transcription factor (or kinases and phosphatases) from the modulation of its targets (transcripts/phosphosites);
Phosphoscore analysis: infer the activity of phosphoproteins from the modulation of their regulatory phosphosites;
Footprint-based analysis derives protein activities from the modulation of downstream targets using the VIPER algorithm. In particular:
TFEA (Transcription Factor Enrichment Analysis) infers transcription factors activity from the modulation of target genes in transcriptomics data;
KSEA (Kinase Substrates Enrichment Analysis) infers kinases/phosphatases activity from the modulation of target phosphosites in phosphoproteomics data;
The set of regulated analytes of a TF/KIN/PHOS is called regulon.
SignalingProfiler built-in regulons
TFs-genes regulons (tfea_db_human, tfea_db_human_collectri) are retrieved from DoRothEA R Package (confidence: A) and from SIGNOR or from Collectri, a new comprehensive resource.
KIN/PHOS-phosphosites regulons (ksea_db_human) are derived from:OmniPath and SIGNOR; given the recent publication of the Serine-Threonine-Tyrosine Kinome Atlas we integrated high confidence regulons in the ksea_db derived from this work. If you want to extend your data with the atlas, set integrated_regulons = TRUE.
Create you own TFEA/KSEA custom regulons
SignalingProfiler 2.0 updates its regulons (regulatory networks) twice a year. If the user want to use updated regulons, we implemented functions taking the information from different resources and making them SignalingProfiler compliant.
tfea_regulons <- create_tfea_regulons(resources = c('SIGNOR', 'Dorothea'),
organism = 'human')
write_tsv(tfea_regulons, '../custom_regulons/tfea_custom_regulons.tsv')
ksea_regulons <- create_ksea_regulons(resources = c('SIGNOR', 'Omnipath'),
organism = 'human')
write_tsv(ksea_regulons, '../custom_regulons/ksea_custom_regulons.tsv')
tf_activity_foot <- run_footprint_based_analysis(
omic_data = tr_toy_df,
analysis = 'tfea',
organism = 'human',
reg_minsize = 10,
exp_sign = FALSE,
collectri = FALSE,
hypergeom_corr = TRUE,
GO_annotation = TRUE,
correct_proteomics = TRUE,
prot_df = prot_toy_df,
custom = FALSE,
custom_path = NULL)
# Set custom = TRUE and specify your custom regulons path in
# custom_path = '../custom_regulons/tfea_custom_regulons.tsv',
# if you have custom regulons!
kin_phos_activity_foot <- run_footprint_based_analysis(
omic_data = phospho_toy_df,
analysis = 'ksea',
organism = 'human',
reg_minsize = 5,
exp_sign = FALSE,
integrated_regulons = TRUE,
hypergeom_corr = TRUE,
GO_annotation = TRUE,
correct_proteomics = TRUE,
prot_df = prot_toy_df,
custom = FALSE,
custom_path = NULL)
# Set custom = TRUE and specify your custom regulons path in
# custom_path = '../custom_regulons/tfea_custom_regulons.tsv',
# if you have custom regulons!
PhosphoScore computation exploits:
the experimental fold-change of the phosphosites in phosphoproteomics;
the regulatory role of phosphosites (activating or inhibiting) annotated in SIGNOR and PsP databases;
The information about how phosphosites regulate the protein they are on is retrieved from SIGNOR and PsP databases (good_phos_df_human_act, good_phos_df_human_act_aapos). The key for identifying the phosphosite can be either Gene-Peptide or Gene-AAPos.
The information about how phosphosites regulate the protein they are on is retrieved from SIGNOR and PsP databases (good_phos_df_human_act, good_phos_df_human_act_aapos).
The key for identifying the phosphosite can be either Gene-Peptide or Gene-AAPos.
SignalingProfiler updates its regulons (regulatory networks) twice a year. If the user want to use updated information it can create its own information about regulatory phosphosites. Importantly, if you want to include PsP information you have to manually download from PsP database the Regulatory_sites and Kinase_Substrates file.
reg_phosphosites <- get_phosphoscore_info(resources = c('SIGNOR', 'PsP'),
organism = 'human',
psp_reg_site_path = '.Regulatory_sites',
psp_kin_sub_path = '.Kinase_Substrates',
aa_pos = FALSE, # TRUE if you want Gene-AA-Pos as key
only_activatory = TRUE)
write_tsv(reg_phosphosites, '../custom_regulons/act_phosphosites.tsv')
phosphoscore_df <- phosphoscore_computation(phosphoproteomic_data = phospho_toy_df,
organism = 'human',
activatory = TRUE ,
GO_annotation = TRUE,
custom = FALSE,
custom_path = NULL)
# Set custom = TRUE and specify your custom phosphosites path in
# custom_path = '../custom_regulons/act_phosphosites.tsv',
# if you have custom phosphosites!
Transcription factors, kinases and phosphatases could have an activity derived from both their targets (transcripts/phosphosites) and from their regulatory phosphosites. The final score computation integrates these two scores. Importantly, the computation should be done separately for transcription factors and kinases/phosphatases.
# for transcription factors
combined_tf <- combine_footprint_and_phosphoscore(
footprint_output = tf_activity_foot,
phosphoscore_df = phosphoscore_df,
analysis = 'tfea')
toy_tf <- combined_tf
# for kinases and phosphatases
combined_kin_phos <- combine_footprint_and_phosphoscore(
footprint_output = kin_phos_activity_foot,
phosphoscore_df = phosphoscore_df,
analysis = 'ksea')
toy_kin <- combined_kin_phos
Then, consider the phosphorylated proteins of the PhosphoScore technique that are not TF/KIN/PHOS as OTHER:
toy_other <- phosphoscore_df %>%
dplyr::filter(mf == 'other') %>%
dplyr::rename(final_score = phosphoscore) %>%
dplyr::mutate(method = 'PhosphoScore')
Then, combine all the information in a final activity table for the next steps.
toy_prot_activity_df <- dplyr::bind_rows(toy_tf, toy_kin, toy_other) %>%
select(UNIPROT, gene_name, mf, final_score, method)
data('toy_prot_activity_df')
DT::datatable(toy_prot_activity_df, options = list(pageLength = 3))
<<<<<<< HEAD
=======
>>>>>>> dc3384fd1d9390421f5c883dabd29d4188c3dc98
toy_prot_activity_df$mf <- factor(
toy_prot_activity_df$mf,
levels = c('tf', 'kin', 'phos', 'other'))
ggplot(toy_prot_activity_df,
aes(x = fct_reorder(gene_name, final_score), y = final_score)) +
geom_bar(stat = "identity", aes(fill = final_score), alpha = 1) +
facet_grid(mf ~ ., space ='free_y', scales = 'free') +
theme_classic() +
scale_fill_gradient2(low = "#89ABD6", high = "#FF6A72",
mid = "whitesmoke", midpoint = 0) +
theme(text = element_text(size= 6),
axis.text.x = element_text(angle = 0, vjust = 1, size = 5),
legend.position="bottom",
legend.title = element_blank()) +
xlab('') +
coord_flip()+
ylab('Final score')
<<<<<<< HEAD
SignalingProfiler offers different techniques to derive the activity of proteins. It is up to you and the data you have the choice of the methods and the combination of methods to use. Importantly, the analysis is flexible and powerful. For example, if you have only phosphoproteomics data, you can still infer the activity of transcription factors using their regulatory phosphosites in the PhosphoScore analysis!!
At the end of Step 1, SignalingProfiler outputs a single table listing all proteins with their inferred activity and molecular function, each represented by a unique combined score from one or both methods.
In this step, SignalingProfiler connects the proteins of Step1 to user-specified starting point(s) (e.g., signaling receptors). If you do not have a starting point, you can still use SignalingProfiler using the inverseCARNIVAL algorithm.
To connect the proteins, SignalingProfiler uses the prior information about molecular interactions stored in SIGNOR and PhosphoSitePlus databases. Briefly, every PKN interaction is binary, directed (has a regulator and a target of the regulation), and signed (representing either an up- or a down-regulation).
The information is stored in different built-in objects that can contain only direct interactions (PKN_human_dir) or the addition of indirect transcriptional regulations (PKN_human_ind).
Custom PKN SignalingProfiler updates its prior knowledge network twice a year. We offer the user an interface to Omnipath and SIGNOR to assemble an updated PKN in SignalingProfiler compliant format. Remember, if you want to include PsP you have to provide its manually downloaded files!
=======Custom PKN SignalingProfiler updates its prior knowledge network twice a year. We offer the user an interface to Omnipath and SIGNOR to assemble an updated PKN in SignalingProfiler compliant format. Remember, if you want to include PsP you have to provide its manually downloaded files!
>>>>>>> dc3384fd1d9390421f5c883dabd29d4188c3dc98
create_PKN(database = c('SIGNOR', 'PsP'), #'Omnipath is also supported
direct = TRUE,
organism = 'human',
psp_reg_site_path = './Regulatory_sites',
psp_kin_sub_path = './Kinase_Substrate_Dataset',
file_path = './custom_regulons/custom_PKN.RDS')
The first step to build a network is choosing the Prior Knowledge Network (PKN).
SignalingProfiler allow you to use built-in PKNs for mouse and human or to exploit a custom PKN. For the custom PKN you have to provide the path of the file.
For human you can choose between a PKN containing the interactions from only databases, or a PKN containing interactions also derived from the kinome atlas, specifying the with_atlas = TRUE. In our benchmarking it resulted in a network with an artificial number of hubs, but it could be still useful if you are interested in not known phosphorylation events!
PKN_table <- choose_PKN(organism = 'human',
with_atlas = FALSE,
direct = TRUE,
custom = FALSE,
custom_path = NULL)
# If you want to use your custom PKN put custom = TRUE and
# custom_path = '../custom_regulons/custom_PKN.RDS'
Since PKN interactions come from different experimental systems, a good practice is to keep only the ones involving analytes expressed in your system, so quantified in at least one of your omics data sets.
PKN_expressed <- preprocess_PKN(
omics_data = list(tr_toy_df, prot_toy_df, phospho_toy_df),
PKN_table = PKN_table)
To speed up the analysis, we want to restrict the PKN interactome to the neighborhood of the inferred proteins, generating a naïve network. It is a graph connecting your proteins through ALL the possible SHORTEST causal paths of a user-specified length (1 to 4). We don’t consider the causality of the edges, just distance. As such, in the naïve network there may be two paths with same length, but different causal meaning.
SignalingProfiler offers three types of naïve networks, but the benchmarking process revealed the two-layered as optimal. We suggest to explore also the three-layered!
# divide proteins according to the molecular function
kin_phos_other <- toy_prot_activity_df %>%
dplyr::filter(mf %in% c('kin', 'phos', 'other'))
tfs <- toy_prot_activity_df %>%
dplyr::filter(mf == 'tf')
# create the naïve network
two_layers_toy <- two_layer_naive_network(starts_gn = c('MTOR', 'AMPK'),
intermediate_gn = kin_phos_other$gene_name,
targets_gn = tfs$gene_name,
PKN_table = PKN_expressed, #or PKN_human_dir
max_length_1 = 3,
max_length_2 = 4,
connect_all = TRUE,
rds_path = './two_layers_naive.rds',
sif_path = './two_layers_naive.sif')
CARNIVAL algorithm optimizes the naïve network in a context-specific model. This means that CARNIVAL will retrieve from the naive network only the edges coherent with context-specific activity of proteins from the Step 1.
Once you have created the naïve network, you have to prepare the result for the next step.
two_layers_toy <- readRDS('./two_layers_naive.rds')
receptor_list <- list('MTOR' = -1, 'AMPK' = 1)
carnival_input_toy <- prepare_carnival_input(two_layers_toy,
toy_prot_activity_df,
receptor_list,
organism = 'human')
DT::datatable(carnival_input_toy, options = list(pageLength = 3))
<<<<<<< HEAD
=======
>>>>>>> dc3384fd1d9390421f5c883dabd29d4188c3dc98
CARNIVAL requires the ILP solver to be installed. We suggest cplex by IBM. A similar performance is observed using the Gurobi solver. To run the analysis you have to specify the name of the solver, so that we can set the default option for the run.
solver = 'cplex'
carnival_options = default_CARNIVAL_options(solver)
Then, you have to manually specify the ILP solver path.
#for Windows
carnival_options$solverPath = "C:/Program Files/CPLEX_solver/cplex/bin/x64_win64/cplex.exe"
#for macOS
carnival_options$solverPath = '/Applications/CPLEX_Studio221/cplex/bin/x86-64_osx/cplex'
#for Linux
carnival_options$solverPath = '/opt/ibm/ILOG/CPLEX_Studio2211/cplex/bin/x86-64_linux/cplex'
# In Windows try the Gurobi ILP solver
# carnival_options = default_CARNIVAL_options(
# solver = "gurobi",
# gurobi_path = "C:\\Users\\sgual\\Downloads\\win64\\bin\\gurobi_cl.exe")
You can change the options used by the solver (try first with the default).
carnival_options$mipGAP <- 0.05
carnival_options$timelimit <- 3600
carnival_options$cplexMemoryLimit <- 8192
To run CARNIVAL algorithm:
create a tibble with the starting point(s) of the model (source_df);
Create a tibble with the ending points of the model (target_df);
Provide the naïve network in SIF format (Step 2 creates the file in your working directory) (naïve_network;
Provide the whole inferred proteins tibble (proteins_df);
Provide the CARNIVAL options you created in the former step (carnival_options);
According to the PKN you used you have to set the direct and with_atlas parameters in the function.
SignalingProfiler offers 4 versions of CARNIVAL algorithm for the optimization. Three version work with a known starting point of the model, one without (inverseCARNIVAL). In the benchmarking, we identified the two-shots CARNIVAL as the optimal way of running the analysis in terms of recapitulated proteins and phosphorylations within the network.
In this vignette, we propose the two-shots CARNIVAL. In this implementation, CARNIVAL:
Connect the pertubed node(s) to kinases, phosphatases and other signaling proteins;
Connect the kinases, phosphatases and other signaling proteins to transcription factors.
Compute the union of the two graphs
# FIRST RUN: RECEPTOR to KIN, PHOS, OTHERS
receptors_df <- carnival_input_toy %>% dplyr::filter(mf == 'rec')
target1_df <- carnival_input_toy %>%
dplyr::filter(mf %in% c('kin', 'phos', 'other'))
two_layers_toy_df <- readr::read_tsv('./two_layers_naive.sif',
col_names = c('source', 'interaction', 'target'))
output1 <- run_carnival_and_create_graph(
source_df = receptors_df,
target_df = target1_df,
naive_network = unique(two_layers_toy_df),
proteins_df = carnival_input_toy,
organism = 'human',
carnival_options = carnival_options,
files = FALSE,
direct = FALSE,
with_atlas = TRUE)
# SECOND RUN: from KIN, PHOS, OTHERS to TFs
run1_output_nodes <- convert_output_nodes_in_next_input(output1)
source_df <- run1_output_nodes %>%
dplyr::filter(mf %in% c('kin', 'phos', 'other'))
target2_df <- carnival_input_toy %>%
dplyr::filter(mf == 'tf')
output2 <- run_carnival_and_create_graph(
source_df = source_df,
target_df = target2_df,
naive_network = unique(two_layers_toy_df),
proteins_df = carnival_input_toy,
organism = 'human',
carnival_options = carnival_options,
direct = FALSE,
with_atlas = TRUE,
files = FALSE)
# UNION OF RUN1 and RUN2 graphs
union <- union_of_graphs(graph_1 = output1$igraph_network,
graph_2 = output2$igraph_network,
proteins_df = carnival_input_toy,
files = TRUE,
path_sif = './optimized_network.sif',
path_rds = './optimized_object.rds')
After CARNIVAL optimization, SignalingProfiler maps phosphoproteomics data on the signaling edges of the model for a better biological interpretation of the result. In fact, since the network connects protein with different activity modulations, some interactions may represent phosphorylation events responsable for that.
optimized_object_rds <- readRDS('./optimized_object.rds')
toy_opt_network <- expand_and_map_edges(
optimized_object = optimized_object_rds,
organism = 'human',
phospho_df = phospho_toy_df,
files = TRUE,
direct = FALSE,
with_atlas = TRUE,
path_sif = 'optimized_network_validated.sif',
path_rds = 'optimized_object_validated.rds')
In the toy_opt_network dataset, we report an example of the Step 2 result. Here, it is reported the set of edges validated with phosphoproteomics data.
In the toy_opt_network dataset, we report an example of the Step 2 result.
Here, it is reported the set of edges validated with phosphoproteomics data.
data('toy_opt_network')
DT::datatable(toy_opt_network$edges_df %>%
dplyr::filter(is_quantified == TRUE),
options = list(pageLength = 5))
<<<<<<< HEAD
=======
>>>>>>> dc3384fd1d9390421f5c883dabd29d4188c3dc98
At the end of Step 2, SignalingProfiler outputs a sign-coherent network connecting inferred proteins from Step 1 with causal interactions from SIGNOR.
To make the model functionally interpretable, SignalingProfiler connects phenotypes to the model proteins and infer the activity of phenotypes integrating the signaling axes in the model contributing to their activation/inhibition.
To this aim, we exploit the ProxPath algorithm to connect SIGNOR phenotypes to the model proteins and the PhenoScore algorithm, to infer the phenotypes activity.
In the PhenoScore computation we indirectly connect proteins to the phenotypes. You can choose the path_length between proteins an phenotype, from 2 to 4.
Crucial concepts: - a protein can regulate a phenotype with multiple paths; - the phenotype will be regulated by activating and inhibitory paths;
Here there are the steps of the algorithm:
Select significant close phenotypes from ProxPath: select from the ProxPath table the significantly close proteins next to your phenotypes of interest (key parameters are zscore_threshold, stat)
Remove randomly regulated phenotypes: given your protein list (protein_df), we sample a random protein list with the same number of proteins and compute the number of activating and inhibitory paths over the phenotype of interest from both lists. If the random list produces the same number of paths of your input list over a phenotype of interest, the phenotype is discarded, because it is regulated at random (key parameters are:n_random, pvalue_threshold).
PhenoScore computation: we compute the average of the activity of proteins regulating the same phenotype, with three considerations: (i) if two independent regulators are connected in the network (sp_graph), consider the most downstream (remove_cascade = TRUE); (ii) we weight proteins according to their number of paths over the phenotype (since they are more resilient, node_idx = TRUE); (iii) the activity considered can be both experimentally derived or derived from CARNIVAL algorithm (use_carnival_activity = TRUE). :::
PhenoScore computation: we compute the average of the activity of proteins regulating the same phenotype, with three considerations: (i) if two independent regulators are connected in the network (sp_graph), consider the most downstream (remove_cascade = TRUE); (ii) we weight proteins according to their number of paths over the phenotype (since they are more resilient, node_idx = TRUE); (iii) the activity considered can be both experimentally derived or derived from CARNIVAL algorithm (use_carnival_activity = TRUE).
:::
To make the ProxPath protein-phenotypic relations context-specific, we remove interactions involving proteins not quantified in your proteomics data.
pheno_distances_table <- phenoscore_network_preprocessing(proteomics = prot_toy_df,
phospho = phospho_toy_df)
Check the available phenotypes from the proxpath_phenotypes table and create a vector of desired phenotypes. Alternatively, you can perform the analysis on all phenotypes.
data('proxpath_phenotypes')
DT::datatable(proxpath_phenotypes, options = list(pageLength = 3))
=======
Check the available phenotypes from the proxpath_phenotypes table and create a vector of desired phenotypes.
Alternatively, you can perform the analysis on all phenotypes.
data('proxpath_phenotypes')
DT::datatable(proxpath_phenotypes, options = list(pageLength = 3))
>>>>>>> dc3384fd1d9390421f5c883dabd29d4188c3dc98
# example phenotypic vector
desired_phenotypes <- c('APOPTOSIS',
'CELL_DEATH', 'CELL_GROWTH', 'CELL_CYCLE_BLOCK',
'DNA_REPAIR',
'PROLIFERATION', 'IMMORTALITY', 'SURVIVAL')
toy_phenoscore_output<- phenoscore_computation(
proteins_df = toy_opt_network$nodes_df,
desired_phenotypes = desired_phenotypes,
pheno_distances_table = NULL,
sp_graph = toy_opt_network$igraph_network,
# closeness of proteins to phenotypes
path_length = 3,
stat = 'mean',
zscore_threshold = -1.96,
# exclude random phenotypes
n_random = 1000,
pvalue_threshold = 0.05,
# optimized network specificity
remove_cascade = TRUE,
node_idx = FALSE,
use_carnival_activity = FALSE,
create_pheno_network = TRUE)
The object returned by phenoscore_computation function contains a barplot to readily interpret the cellular functions modulated by your network.
The object returned by phenoscore_computation function contains a barplot
to readily interpret the cellular functions modulated by your network.
data('toy_phenoscore_output')
phenoscore_df1 <- toy_phenoscore_output$table_phenotypes
phenoscore_df1 <- phenoscore_df1 %>%
dplyr::mutate(reg = ifelse(phenoscore < 0, 'down', 'up'))
color_list <- list('down' = '#407F7F', 'up' = '#D46A6A')
ggplot2::ggplot(phenoscore_df1,
ggplot2::aes(x = forcats::fct_reorder(EndPathways,
phenoscore),
y = phenoscore, fill = reg))+
ggplot2::geom_bar(stat = 'identity', alpha = 0.8)+
ggplot2::scale_fill_manual(values = color_list,
labels = names(color_list)) +
ggplot2::ggtitle(paste0('Phenoscore')) +
ggplot2::ylab("phenotype modulation") +
ggplot2::xlab("") +
ggplot2::theme_bw()+
ggplot2::theme(legend.title = element_blank(),
legend.position = 'none',
plot.title=element_text(hjust = 0.5),
panel.grid.minor = element_blank(),
axis.text.y = element_text( size = 14, face = 'bold'),
axis.text.x=element_text(size=10, angle=0, vjust=0.5, hjust=0.5))+
ggplot2::coord_flip()
<<<<<<< HEAD
You can also inspect the proteins in the network regulating INDIRECTLY each phenotype.
sub_table <- toy_phenoscore_output$table_regulators %>%
dplyr::select(EndPathways, Effect, regulators)
DT::datatable(sub_table, options = list(pageLength = 3))
<<<<<<< HEAD
=======
>>>>>>> dc3384fd1d9390421f5c883dabd29d4188c3dc98
To clearly visualize the network, we suggest to optimize the model for the last time on the phenotypic activity. Then you can use a built-in Cytoscape style to render the network. Note: open Cytoscape in your computer before this step!
=======To clearly visualize the network, we suggest to optimize the model for the last time on the phenotypic activity. Then you can use a built-in Cytoscape style to render the network. Note: open Cytoscape in your computer before this step!
>>>>>>> dc3384fd1d9390421f5c883dabd29d4188c3dc98
solver = 'cplex'
carnival_options <- default_CARNIVAL_options(solver)
opt1 <- optimize_pheno_network(sp_object = toy_phenoscore_output$sp_object_phenotypes,
organism = 'human',
phospho_df = phospho_toy_df,
carnival_options = carnival_options,
files = TRUE,
direct = TRUE,
with_atlas = FALSE,
path_sif = './pheno_opt1.sif',
path_rds = './pheno_opt1.rds')
opt1_graph <- format_for_visualization(opt1)
# Visualization in Cytoscape
RCy3::createNetworkFromIgraph(igraph=opt1$sp_object_phenotypes$igraph_network,
title = 'My model',
collection = 'My collection')
data_path <- system.file("extdata", "SP_pheno_layout.xml", package = "SignalingProfiler")
RCy3::importVisualStyles(filename = data_path)
RCy3::setVisualStyle('SP_pheno_layout')
Examples of SignalingProfiler 2.0 resulting network in NDEX:
SignalingProfiler offers the possibily to extract functional circuits from a desired start nodes to a set of end nodes (e.g., phenotypes). k is the max path length between two proteins in the circuit.
opt1_list <- readRDS('./pheno_opt1.rds')
opt1_list$sp_object_phenotypes <- format_for_visualization(opt1_list)
# Circuits on example phenotypes
phenotypes <- c('PROLIFERATION', 'AUTOPHAGY')
start_nodes <- c('MTOR', 'AMPK')
# Circuit on autophagy, ribosome and translation
circuit <- pheno_to_start_circuit(SP_object = opt1_list$sp_object_phenotypes,
start_nodes = start_nodes,
phenotypes = phenotypes,
k = 5,
start_to_top = TRUE)
RCy3::createNetworkFromIgraph(igraph=circuit,
title = 'My network',
collection = 'Metformin')
RCy3::setVisualStyle('SP_pheno_layout')
<<<<<<< HEAD
Here, we report a functional circuit from the Metformin use case, where MCF7 cell line was treated with Metformin. 
Here, we report a functional circuit from the Metformin use case,
where MCF7 cell line was treated with Metformin.
This is a tutorial for the usage of SignalingProfiler. For any doubt please contact us at: veronica.venafra97@gmail.it, livia.perfetto@uniroma1.it, francesca.sacco@uniroma2.it.
=======This is a tutorial for the usage of SignalingProfiler. For any doubt please contact us at: veronica.venafra97@gmail.it, livia.perfetto@uniroma1.it, francesca.sacco@uniroma2.it.
>>>>>>> dc3384fd1d9390421f5c883dabd29d4188c3dc98xfun::session_info()
R version 4.4.2 (2024-10-31)
Platform: aarch64-apple-darwin20
Running under: macOS Sequoia 15.3.1
Locale: en_US.UTF-8 / en_US.UTF-8 / en_US.UTF-8 / C / en_US.UTF-8 / en_US.UTF-8
Package version:
<<<<<<< HEAD
annotate_1.84.0 AnnotationDbi_1.68.0 askpass_1.2.1
backports_1.5.0 base64enc_0.1.3 bcellViper_1.42.0
BH_1.87.0.1 Biobase_2.66.0 BiocGenerics_0.52.0
BiocManager_1.30.25 BiocParallel_1.40.0 BiocStyle_2.34.0
Biostrings_2.74.1 bit_4.5.0.1 bit64_4.6.0-1
blob_1.2.4 bookdown_0.42 broom_1.0.7
bslib_0.9.0 cachem_1.1.0 callr_3.7.6
CARNIVAL_2.16.0 cellranger_1.1.0 checkmate_2.3.2
class_7.3-23 cli_3.6.4 clipr_0.8.0
cluster_2.1.8 codetools_0.2.20 colorspace_2.1-1
compiler_4.4.2 conflicted_1.2.0 cosmosR_1.14.0
cpp11_0.5.1 crayon_1.5.3 crosstalk_1.2.1
curl_6.2.0 data.table_1.16.4 DBI_1.2.3
dbplyr_2.5.0 decoupleR_2.12.0 digest_0.6.37
dorothea_1.18.0 dplyr_1.1.4 DT_0.33
dtplyr_1.3.1 e1071_1.7-16 evaluate_1.0.3
fansi_1.0.6 farver_2.1.2 fastmap_1.2.0
fontawesome_0.5.3 forcats_1.0.0 formatR_1.14
fs_1.6.5 futile.logger_1.4.3 futile.options_1.0.1
gargle_1.5.2 generics_0.1.3 GenomeInfoDb_1.42.3
GenomeInfoDbData_1.2.13 ggforce_0.4.2 ggplot2_3.5.1
glue_1.8.0 googledrive_2.1.1 googlesheets4_1.1.1
graph_1.84.1 graphics_4.4.2 grDevices_4.4.2
grid_4.4.2 GSEABase_1.68.0 gtable_0.3.6
haven_2.5.4 here_1.0.1 highr_0.11
hms_1.1.3 htmltools_0.5.8.1 htmlwidgets_1.6.4
httpuv_1.6.15 httr_1.4.7 ids_1.0.1
igraph_2.1.4 IRanges_2.40.1 isoband_0.2.7
jquerylib_0.1.4 jsonlite_1.8.9 KEGGREST_1.46.0
kernlab_0.9-33 KernSmooth_2.23-26 knitr_1.49
labeling_0.4.3 lambda.r_1.2.4 later_1.4.1
lattice_0.22-6 lazyeval_0.2.2 lifecycle_1.0.4
logger_0.4.0 lpSolve_5.6.23 lubridate_1.9.4
magick_2.8.5 magrittr_2.0.3 MASS_7.3-64
Matrix_1.7-2 memoise_2.0.1 methods_4.4.2
mgcv_1.9.1 mime_0.12 mixtools_2.0.0
modelr_0.1.11 munsell_0.5.1 nlme_3.1-167
OmnipathR_3.14.0 openssl_2.3.2 parallel_4.4.2
parallelly_1.42.0 permute_0.9.7 pheatmap_1.0.12
pillar_1.10.1 pkgconfig_2.0.3 plogr_0.2.0
plotly_4.10.4 plyr_1.8.9 png_0.1-8
polyclip_1.10.7 prettyunits_1.2.0 processx_3.8.5
progress_1.2.3 promises_1.3.2 proxy_0.4-27
ps_1.8.1 purrr_1.0.4 R.methodsS3_1.8.2
R.oo_1.27.0 R.utils_2.12.3 R6_2.6.1
ragg_1.3.3 rappdirs_0.3.3 RColorBrewer_1.1.3
Rcpp_1.0.14 RcppEigen_0.3.4.0.2 RcppTOML_0.2.2
readr_2.1.5 readxl_1.4.3 rematch_2.0.0
rematch2_2.1.2 reprex_2.1.1 reticulate_1.42.0
rjson_0.2.23 rlang_1.1.5 rmarkdown_2.29
rprojroot_2.0.4 RSQLite_2.3.9 rstudioapi_0.17.1
RVenn_1.1.0 rvest_1.0.4 S4Vectors_0.44.0
sass_0.4.9 scales_1.3.0 segmented_2.1-3
selectr_0.4.2 SignalingProfiler_0.99.0 snow_0.4.4
splines_4.4.2 stats_4.4.2 stats4_4.4.2
stringi_1.8.4 stringr_1.5.1 survival_3.8-3
sys_3.4.3 systemfonts_1.2.1 textshaping_1.0.0
tibble_3.2.1 tidyr_1.3.1 tidyselect_1.2.1
tidyverse_2.0.0 timechange_0.3.0 tinytex_0.54
tools_4.4.2 tweenr_2.0.3 tzdb_0.4.0
UCSC.utils_1.2.0 utf8_1.2.4 utils_4.4.2
uuid_1.2.1 vctrs_0.6.5 vegan_2.6.10
viper_1.40.0 viridisLite_0.4.2 visNetwork_2.1.2
vroom_1.6.5 withr_3.0.2 xfun_0.50
XML_3.99.0.18 xml2_1.3.6 xtable_1.8.4
XVector_0.46.0 yaml_2.3.10 zip_2.3.2
zlibbioc_1.52.0
=======
abind_1.4.8 ade4_1.7.23
airr_1.5.0 alakazam_1.3.0
annotate_1.84.0 AnnotationDbi_1.68.0
ape_5.8.1 askpass_1.2.1
backports_1.5.0 base64enc_0.1.3
bcellViper_1.42.0 BH_1.87.0.1
Biobase_2.66.0 BiocGenerics_0.52.0
BiocManager_1.30.25 BiocParallel_1.40.0
BiocStyle_2.34.0 Biostrings_2.74.1
bit_4.5.0.1 bit64_4.6.0.1
bitops_1.0.9 blob_1.2.4
bookdown_0.42 boot_1.3.31
broom_1.0.7 bslib_0.9.0
cachem_1.1.0 callr_3.7.6
car_3.1.3 carData_3.0.5
CARNIVAL_2.16.0 cellranger_1.1.0
class_7.3-23 cli_3.6.4
clipr_0.8.0 cluster_2.1.8
codetools_0.2.20 colorspace_2.1-1
compiler_4.4.2 conflicted_1.2.0
corrplot_0.95 cosmosR_1.14.0
cowplot_1.1.3 cpp11_0.5.1
crayon_1.5.3 crosstalk_1.2.1
curl_6.2.0 data.table_1.16.4
data.tree_1.1.0 DBI_1.2.3
dbplyr_2.5.0 decoupleR_2.12.0
DelayedArray_0.32.0 Deriv_4.1.6
digest_0.6.37 doBy_4.6.25
dorothea_1.18.0 dplyr_1.1.4
DT_0.33 dtplyr_1.3.1
e1071_1.7-16 evaluate_1.0.3
fansi_1.0.6 farver_2.1.2
fastmap_1.2.0 fontawesome_0.5.3
forcats_1.0.0 formatR_1.14
Formula_1.2.5 fs_1.6.5
futile.logger_1.4.3 futile.options_1.0.1
gargle_1.5.2 generics_0.1.3
GenomeInfoDb_1.42.3 GenomeInfoDbData_1.2.13
GenomicAlignments_1.42.0 GenomicRanges_1.58.0
ggforce_0.4.2 ggplot2_3.5.1
ggpubr_0.6.0 ggrepel_0.9.6
ggsci_3.2.0 ggsignif_0.6.4
glue_1.8.0 googledrive_2.1.1
googlesheets4_1.1.1 gprofiler2_0.2.3
graph_1.84.1 graphics_4.4.2
grDevices_4.4.2 grid_4.4.2
gridExtra_2.3 GSEABase_1.68.0
gtable_0.3.6 haven_2.5.4
here_1.0.1 highr_0.11
hms_1.1.3 htmltools_0.5.8.1
htmlwidgets_1.6.4 httpuv_1.6.15
httr_1.4.7 ids_1.0.1
igraph_2.1.4 IRanges_2.40.1
isoband_0.2.7 jquerylib_0.1.4
jsonlite_1.8.9 KEGGREST_1.46.0
kernlab_0.9-33 KernSmooth_2.23-26
knitr_1.49 labeling_0.4.3
lambda.r_1.2.4 later_1.4.1
lattice_0.22-6 lazyeval_0.2.2
lifecycle_1.0.4 lme4_1.1.36
lpSolve_5.6.23 lubridate_1.9.4
magick_2.8.5 magrittr_2.0.3
MASS_7.3-64 Matrix_1.7-2
MatrixGenerics_1.18.1 MatrixModels_0.5.3
matrixStats_1.5.0 memoise_2.0.1
methods_4.4.2 mgcv_1.9.1
microbenchmark_1.5.0 mime_0.12
minqa_1.2.8 mixtools_2.0.0
modelr_0.1.11 munsell_0.5.1
networkD3_0.4 nlme_3.1-167
nloptr_2.1.1 nnet_7.3.20
numDeriv_2016.8.1.1 openssl_2.3.2
parallel_4.4.2 parallelly_1.42.0
pbkrtest_0.5.3 permute_0.9.7
pheatmap_1.0.12 pillar_1.10.1
pixmap_0.4.13 pkgconfig_2.0.3
plogr_0.2.0 plotly_4.10.4
plyr_1.8.9 png_0.1-8
polyclip_1.10.7 polynom_1.4.1
prettyunits_1.2.0 processx_3.8.5
progress_1.2.3 promises_1.3.2
proxy_0.4-27 ps_1.8.1
purrr_1.0.4 qdapRegex_0.7.8
quantreg_6.0 R6_2.6.1
ragg_1.3.3 rappdirs_0.3.3
rbibutils_2.3 RColorBrewer_1.1.3
Rcpp_1.0.14 RcppArmadillo_14.2.3.1
RcppEigen_0.3.4.0.2 RcppTOML_0.2.2
RCurl_1.98.1.16 Rdpack_2.6.2
readr_2.1.5 readxl_1.4.3
reformulas_0.4.0 rematch_2.0.0
rematch2_2.1.2 reprex_2.1.1
reticulate_1.40.0 Rhtslib_3.2.0
rjson_0.2.23 rlang_1.1.5
rmarkdown_2.29 rprojroot_2.0.4
Rsamtools_2.22.0 RSQLite_2.3.9
rstatix_0.7.2 rstudioapi_0.17.1
RVenn_1.1.0 rvest_1.0.4
S4Arrays_1.6.0 S4Vectors_0.44.0
sass_0.4.9 scales_1.3.0
segmented_2.1-3 selectr_0.4.2
seqinr_4.2.36 SignalingProfiler_2.1.4
snow_0.4.4 sp_2.2.0
SparseArray_1.6.1 SparseM_1.84.2
splines_4.4.2 stats_4.4.2
stats4_4.4.2 stringi_1.8.4
stringr_1.5.1 SummarizedExperiment_1.36.0
survival_3.8-3 sys_3.4.3
systemfonts_1.2.1 textshaping_1.0.0
tibble_3.2.1 tidyr_1.3.1
tidyselect_1.2.1 tidyverse_2.0.0
timechange_0.3.0 tinytex_0.54
tools_4.4.2 tweenr_2.0.3
tzdb_0.4.0 UCSC.utils_1.2.0
UniprotR_2.4.0 utf8_1.2.4
utils_4.4.2 uuid_1.2.1
vctrs_0.6.5 vegan_2.6.10
viper_1.40.0 viridisLite_0.4.2
visNetwork_2.1.2 vroom_1.6.5
withr_3.0.2 xfun_0.50
XML_3.99.0.18 xml2_1.3.6
xtable_1.8.4 XVector_0.46.0
yaml_2.3.10 zlibbioc_1.52.0
>>>>>>> dc3384fd1d9390421f5c883dabd29d4188c3dc98